Bioinformatics analysis of immune-related prognostic genes and immunotherapy in renal clear cell carcinoma

Clear cell renal cell carcinoma (ccRCC) is an immunogenic tumor, and investigating the immunorelated genes is essential. To investigate the immunoprognostic genes of ccRCC, we analyzed the data assimilated from a public database (The Cancer Genome Atlas (TCGA) database and the gene expression omnibus (GEO) database) using bioinformatics. Then, an immunoprognosis model was constructed to identify four hub genes with moderate predictive values for the prognosis of ccRCC patients. These four genes were associated with the prognosis of ccRCC patients based on Oncomine and Gena Expression Profiling Interactive Analysis (GEPIA) databases. The correlation analysis between the immune infiltrate, immune checkpoints, and immunotherapy and this immunoprognosis model showed that immune infiltration could predict the immunotherapy effects. We also conducted a quantitative real-time polymerase chain reaction analysis and found that the expressions of three hub genes were associated with tumor progression (P<0.1). In conclusion, four genes that may serve as potential biomarkers in ccRCC were identified with respect to prognosis.


Introduction
ccRCC is an immunogenic tumor and benefits from immunotherapy. Ipilimumab, an anti-CTLA-4 antibody, combined with nivolumab, showed a response rate of about 40% in the CheckMate 016 trial [1]. Recent advances in bioinformatics have focused on the correlation between ccRCC prognosis and immunity. The targeted immune cell checkpoints are one of the most effective methods to activate an anti-tumor immune response. Cytotoxic T lymphocyte-associated protein 4 (CTLA-4) and programmed cell death protein 1 (PD-1) are two common immune checkpoints. Several studies reported immuoprognostic models (IPMs). Luo et al. [2] revealed three IPMs to assess the prognosis of metastatic colorectal cancer, obtained at 1, 3, and 5 years; the area under the curve (AUC) was 0.646, 0.7, and 0.692, respectively. Moreover, the tumor-infiltrating immune cells (TIICs) [3]  progression and have a potential prognostic value. A previous study showed that intratumoral immune cell infiltration is associated with the outcomes of cancer patients. Choueiri et al. [4] reported that kidney cancer patients with a high abundance of CD8+T cells have a poor prognosis, while Bromwich et al. [5] indicated that CD4+T cell infiltration was associated with poor survival. Presently, the mechanism of how immunity affects the occurrence and development of ccRCC is unclear. Thus, the current study aimed to establish a correlation between immunity and the occurrence and development of ccRCC. This study analyzed the immunerelated differentially expressed genes (DEGs) associated with ccRCC prognosis. These findings would be beneficial in improving the prognosis of patients with ccRCC.

Data and database analysis
Data were obtained from TCGA and GSE53757 in the GEO database [6]. The GSE53757 source files were transformed into log2 and normalized to expression matrices. DEGs were calculated using the R package "Limma" to differentiate between ccRCC and normal kidney samples. Subsequently, the DEGs were screened if the log2fold-change was >1 and the false discovery rate (FDR) was 0.05. TCGA data after pretreatment using the "edgeR" R package for variance analysis were set to several times the expression of genes, the amount of correction after each gene expression to >1, the domain value as |log2fold-change|>2, and FDR<0.05 for screening the DEGs. The "VennDiagram" package was used to identify the intersection of the DEGs obtained from TCGA and GEO databases and the immune-related genes (IRGs) downloaded from the Import database.

Function enrichment analysis of DEGs
We used the R packages of "clusterprofiler," "enrichplot," and "ggplot2" for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Significant P-and Q-values were set to 0.05.

Prognostic-related survival model
Using IRGS, univariate Cox risk regression, and least absolute shrinkage and selection operator (LASSO) regression analysis (Simple Random sampling of "caret" package with R software was performed at the ratio of 2:1 for the training and test groups), multivariate Cox risk regression was analyzed to build a risk prognosis model, and survival-related hub genes were screened out. P<0.001 was statistically significant in the univariate Cox regression. The "Glmnet" package of R was used to set 1000-fold replacement and 10-fold cross-validation for LASSO regression. Kaplan-Meier survival plots showed the prognosis of ccRCC patients. the AUC was calculated using the R package "survivalROC." Oncomine and GEPIA databases [7] were used to analyze the prognostic value of DEGs In order to verify the hub genes, five qualified databases were obtained from Oncomine [8] (www.oncomine.org/), followed by a meta-analysis. The filter conditions were as follows: the cancer type was ccRCC, the sample type was mRNA, and the analysis type was a case-control study. The threshold was set at 1e-4, the fold-change to 2, and the gene ranked up to 10%. Then, the GEPIA database (http://gepia.cancer-pku.cn/) was analyzed to determine the survival prognosis of hub genes.

Immunoinfiltration analysis of prognostic genes
Cibersort [9] is a deconvolution algorithm that quantifies the cell components from a large number of tissue gene expression profiles. Because of prior knowledge of the expression profiles of purified leukocyte subpopulations, Cibersort can accurately estimate the tumor tissue's immune composition.

Genetic alteration in genes key related to prognosis
The mutation data were obtained from the TCGA database. Next, we analyzed the genetic changes in different risk groups and their correlation with prognosis.

Immunotherapy and immune targets of prognostic genes associated with ccRCC
We investigated the differential expression of PD-1 and CTLA-4 genes in patients in different risk groups. In addition, we downloaded 4 (kidney renal clear cell carcinoma) KIRC files related to immunotherapy from the cancer immunoAtlas [10] (https://tica.at/). These included ctla4_neg_pd1_neg, ctla4_neg_pd1_pos, ctla4_pos_pd1_neg, and ctla4_pos_pd1_pos. Under the four conditions above, the Wilcoxon test was used to evaluate the efficacy of immunotherapy in the two risk groups.

RNA extraction and quantitative real-time polymerase chain reaction (qRT-PCR)
In this study, 12 pairs of ccRCC and para-cancerous tissues were collected from Tongji Hospital of Huazhong University of Science and Technology (Wuhan, China) between January 2022 and May 2022. Total RNA was isolated from ccRCC and normal kidney cells using RNAeasy TM Isolation Reagent Vazyme Cat (Vazyme, Nanjing, China). A reverse transcription kit (Vazyme) was used to convert the mRNA into cDNA. qRT-PCR was carried out with a SYBR Green kit R701-02 (TransGen Biotech, Beijing, China) in a final volume of 10 mL to detect the expression level of these genes. The data were analyzed using StepOne software (Thermo Fisher Scientific, China). GAPDH was used as a reference gene to calculate the relative expression levels using 2 -DDCT . Primer sequences are listed in the S1 Table. Statistical analysis R packages "edgeR" and "Limma" were used to analyze the differential genes. Two groups were compared using the Wilcoxon test. Survival analysis was carried out using Kaplan-Meier method and the log-rank test. The survival-related receiver operating characteristic (ROC) package was utilized to draw the ROC curve. The chi-square test was used to analyze the differences between the two groups in terms of clinical parameters, and the difference was considered statistically significant at P<0.05. All statistical analyses were performed using GraphPad Prism 9 software (GraphPad Software, San Diego, CA, USA).

Characteristics and functional clustering of 128 differentially expressed IRGs in ccRCC
A total of 1777 DEGs were obtained from the TCGA database (Fig 1A), and 2938 DEGs were obtained from the GEO database ( Fig 1B). These were pooled with 2483 IRGs, and finally, 128 IRGs ( Fig 1C and Table 1) overlapping in both databases were identified, including 102 upregulated and 26 downregulated IRGs. GO ( Fig 1D) and KEGG ( Fig 1E) enrichment analyses of the above IRGs showed a response to chemokines, external side plasma membrane, and receptor-ligand activity, which are the most common biological terms for BP, CC, and MF. The most enriched form in KEGG pathway analysis was the cytokine-cytokine receptor interaction. The study schematic is illustrated in Fig 2.

Establishment and validation of prognostic risk models of the four hub genes
In order to identify the key IRGs in the occurrence and development of ccRCC and use them as potential predictive targets, 128 IRGs were evaluated by univariate Cox regression analysis.
The median of the risk score calculated using the IRGPI-based risk prognosis model was 0.9243977, and the samples downloaded from the TCGA database were divided into low-and high-risk groups. To verify the reliability of the model, we analyzed the outcomes and found that the low-risk group had better outcomes than the high-risk group (P<0.001; Fig 4A-4C). The ROC curve showed that our models have a moderate predictive value (Fig 4D-4F). We also analyzed the correlation between four hub genes and clinically relevant indicators and found that the gender, tumor grade, and tumor stage were correlated with risk score (P<0.05; S2B-S2D Fig). Additionally, the four hub genes were significantly upregulated in the tumor group based on the GEO database (P<0.001; Fig 5A). Then, we conducted a meta-analysis of four hub genes based on the results from the Oncomine database, and the results were consistent with previous results (P<0.05; Fig 5B-5E). To verify whether these four hub genes are related to prognosis, survival analysis was performed using GEPIA tools (http://gepia.cancer-pku.cn/); the expression of the genes was negatively correlated to tumor prognosis (P<0.05; Fig 5F-5I).

PLOS ONE
prognosis. Moreover, These data suggested that the immune status of ccRCC can be predicted based on four hub genes.

Mutation load was significantly different between the two risk groups
A high mutation load and increase in neoantigens were associated with the clinical effects of the immune checkpoint inhibitor therapy. Therefore, we analyzed the genetic changes in the two risk groups of samples (S2A and S2B Fig) and found the primary mutation in the highrisk group in Missense_Mutation, In_Frame_Del, Frame_Shift_Ins, Translation_Start_Site, Frame_Shift_Del, In_Frame_Ins, Nonsense_Mutation, and Multi_Hit in three genes, VHL, PBRM1, and SETD2. The main mutation points in the low-risk group were Nonsense_Mutation, Translation_Start_Site, Frame_shift_Del, In_Frame_DeI, Frame_Shift_Ins, Nonstop_-Mutation, Missense_Mutation, and Multi_Hit in three genes: VHL, PBRM1, and TTN. We also analyzed the differences in the tumor mutation load expression in the two groups (S2C Fig) and found that the high-risk group had a higher tumor mutation load than the low-risk group.

Expression of immunotarget-related genes and the efficacy of immunotherapy differ significantly in the two risk groups
We analyzed the differential expression of PD-1 and CTLA-4 in both groups and found that PD-1 and CTLA-4 were overexpressed in the high-risk group (Fig 7A and 7B). Next, to demonstrate the effectiveness of our survival prognosis model, we analyzed the efficacy of immunotherapy in the two risk groups under four different conditions (Fig 7C-7F). The

PLOS ONE
immunotherapy efficacy of patients in the high-risk group was significantly stronger when they received either CTLA-4-positive or both treatments. The findings indicated that immunotherapy is more effective in high-risk ccRCC than in the low-risk group.

Expression of four hub genes by qRT-PCR at the cellular and tissue levels of ccRCC
According to qRT-PCR analysis, BIRC5 and CXCL5 mRNA levels were significantly higher in ccRCC than in the paracancer tissues (P<0.05; Fig 8B-8C). Furthermore, compared to the normal renal epithelial cell line HK-2, the ccRCC cell line 786O expressed significantly more BIRC5 (P<0.01; Fig 8F). Due to the short postoperative follow-up time of the collected cases, we analyzed the correlation between the qRT-PCR expression values of the four hub genes and clinicopathology ( Table 3) and found that the expression values of three hub genes (RNASE2, BIRC5, and GNRH1) were associated with tumor progression (P<0.1).

Discussion
A total of 128 IRGs differentially expressed in ccRCC tissues compared to normal tissues were identified from TCGA and GEO databases. Subsequently, functional enrichment analysis https://doi.org/10.1371/journal.pone.0272542.g008

PLOS ONE
found that these genes were mainly involved in inflammation and immunity. Univariate Cox, LASSO, and multivariate Cox regression analyses revealed four prognostic hub genes (RNASE2, BIRC5, CXCL5, and GNRH1), and a prognosis model was established. We also verified the reliability of the model and found that the four hub genes were closely related to the prognosis of ccRCC patients. To further analyze the reliability of the model, we evaluated the expression of immune infiltrate and tumor mutation load, IRGs, and immunotherapy effect in the two risk groups and found that the four hub genes could predict the immune status of ccRCC patients. These results demonstrated that BIRC5 expression was significantly higher in ccRCC and tumor cell lines than in paracancer tissues and normal cell lines. The expression of CXCL5 was significantly higher in tumor tissue than paracancer tissues; however, the expression of CXCL5 in 786O and HK-2 cells did not differ significantly, which might be because the cancer cell lines do not represent the gene expression of tumor tissues. Also, GNRH1 and RENSE2 expressions did not differ significantly between the tissues and cells. Next, we evaluated whether the expression values of these hub genes could affect the prognosis. Due to the short follow-up period, we analyzed the correlation between clinicopathology and the gene expression values and found that tumor grade was correlated with the expression values of the three hub genes. However, due to the small sample size and short follow-up time, the association was not statistically significant (P>0.05), necessitating further exploration. BIRC5 (Survivin) is strongly expressed in most human tumors and is closely related to tumor progression, recurrence, and chemotherapy resistance [11]. It is also associated with poor prognosis and progression of ccRCC [12,13]. Current studies have shown that RNASE2, a member of the pancreatic ribonuclease family, is a useful prognostic predictor of ccRCC [14]. Among cancer patients, CXCL5 [15] is closely related to survival time, recurrence, and metastasis and is associated with a poor prognosis in ccRCC [16]. Microslaw et al. [17] demonstrated that the expression levels of chorionic gonadotropin beta subunit and GnRH1 in the blood of cancer patients might be valuable in indicating tumor metastasis and spread. Wu et al. [18] demonstrated that GnRH1 and LTB4R are prognostic biomarkers for patients with ccRCC.
Reportedly, a risk model can predict the prognosis of patients. For example, Hua et al. [19] combined multiple databases to build a survival model of immune-related prognostic genes, which could accurately predict the prognosis of ccRCC patients. Wan et al. [14] used 7 IRGs to establish a regression model for predicting the prognosis of ccRCC patients and found that the risk model could accurately distinguish patients with different survival outcomes and identify those with high and low mutation loads. Notably, our model could also predict patient prognosis. The four hub genes were considered independent prognostic factors related to the clinicopathological characteristics.
In addition, we analyzed the correlation between the risk model and immune filtration, immune target, and immunotherapy and found that the four genes predicted the immune status. These analyses demonstrated a poor response of the high-risk group to immunosuppression.
In a study by Zhang et al. [20], low levels of resting dendritic cell infiltration were observed in the high mutation load (TMB) group, and high levels of TMB were associated with low survival rates, while low resting dendritic cell infiltration was associated with poor prognosis. Castle et al. [21] reported that dendritic cells activate T cells in patients and increase cytotoxic T cell production, exerting anti-tumor effects. Basile et al. [22] demonstrated that high tumorinfiltrating neutrophils indicated poor overall survival in ccRCC patients. A previous study found that in patients with ccRCC mast cells [23] and high infiltration, the immune treatment may be effective with an improved prognosis. Plasma cells and T cells are crucial in tumor immunity. In the current study, neutrophils, CD4 memory T cells, follicular helper T cells, and regulatory T cells were abundant in the high-risk group, while resting mast cells were low. We further analyzed the prognosis of patients with varying content of immune cells and found that the patients with a high number of resting mast cells had better outcomes than those with low numbers. On the other hand, the patients with low plasma cells, follicular helper T cells, and regulatory T cells had a poor prognosis.
Previous studies showed that the tumor mutation load is positively related to the presence of a novel antigen that significantly enhances the efficacy of immunotherapy. Subsequently, we tested our model and found that the high-risk group had a high mutation load; these results were in agreement with those of Hua et al. [19]. According to the immunotherapy results, we found that the high-risk group had better immunotherapy efficacy. These observations further confirmed the functional correctness of our prognosis model based on the four hub genes.
However, the current study differs from the previous studies in some aspects. First, we combined multiple databases to construct the survival models of four hub genes related to prognosis. Then, we performed internal and external validation, and the qRT-PCR results showed that the two hub genes were differentially expressed in the tumor and normal tissues. Subsequently, we analyzed the immune cell content in different risk groups and the prognosis of patients with varied immune cell content. Finally, we analyzed the differential expression of immune targets, PD-1 and CTLA-4, in the two risk groups in the survival model constructed using the four hub genes and verified the data related to the treatment using the two immune targets downloaded from the public database, proving the accuracy of the model function.
Nevertheless, this study has some limitations. The first is the limited sample size; hence, the findings require further substantiation using multiple samples and multicenter clinical trials. Second, the exact mechanism by which these four genes influence the prognosis of ccRCC patients is yet unknown. In future studies, we need to prospectively follow up on ccRCC patients and detect the correlation of these four genes to confirm the accuracy of the predictive model on the occurrence, development, and prognosis of ccRCC.

Conclusions
We used four hub genes to construct a prognostic risk model to verify its accuracy. As a prognostic marker, the risk score of this model might distinguish the immune status, immunotherapy effect, and prognosis of patients with different risk scores.